51 research outputs found

    Identifying metabolites by integrating metabolome databases with mass spectrometry cheminformatics.

    Novel metabolites distinct from canonical pathways can be identified through the integration of three cheminformatics tools: BinVestigate, which queries the BinBase gas chromatography-mass spectrometry (GC-MS) metabolome database to match unknowns with biological metadata across over 110,000 samples; MS-DIAL 2.0, a software tool for chromatographic deconvolution of high-resolution GC-MS or liquid chromatography-mass spectrometry (LC-MS); and MS-FINDER 2.0, a structure-elucidation program that uses a combination of 14 metabolome databases in addition to an enzyme promiscuity library. We showcase our workflow by annotating N-methyl-uridine monophosphate (UMP), lysomonogalactosyl-monopalmitin, N-methylalanine, and two propofol derivatives

    Optimal neighborhood indexing for protein similarity search

    Background: Similarity inference, one of the main bioinformatics tasks, has to face an exponential growth of biological data. A classical approach used to cope with this data flow involves heuristics with large seed indexes. To speed up this technique, the index can be enhanced by storing additional information that limits the number of random memory accesses. However, this improvement leads to a larger index that may become a bottleneck. In the case of protein similarity search, we propose to decrease the index size by reducing the amino acid alphabet. Results: The paper presents two main contributions. First, we show that an optimal neighborhood indexing combining an alphabet reduction and a longer neighborhood leads to a 35% reduction of the memory involved in the process, without sacrificing the quality of results or the computational time. Second, our approach led us to develop a new kind of substitution score matrices and their associated e-value parameters. In contrast to usual matrices, these matrices are rectangular since they compare amino acid groups from different alphabets. We describe the method used for computing those matrices and we provide some typical examples that can be used in such comparisons. Supplementary data can be found on the website http://bioinfo.lifl.fr/reblosum. Conclusions: We propose a practical index size reduction of the neighborhood data that does not negatively affect the performance of large-scale search in protein sequences. Such an index can be used in any study involving large protein data. Moreover, rectangular substitution score matrices and their associated statistical parameters can have applications in any study involving an alphabet reduction.
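The idea of shrinking a seed index by reducing the amino acid alphabet can be illustrated with a minimal sketch. The grouping in `GROUPS`, the seed length `k`, and the function names are assumptions made for illustration only; they are not the alphabet or index layout used in the paper.

```python
from collections import defaultdict

# Hypothetical reduced alphabet: each group of similar amino acids is
# represented by its first letter. This grouping is illustrative only.
GROUPS = ["AGST", "C", "DENQ", "FWY", "HKR", "ILMV", "P"]
REDUCE = {aa: group[0] for group in GROUPS for aa in group}


def reduce_sequence(seq):
    """Rewrite a protein sequence over the reduced alphabet."""
    return "".join(REDUCE[aa] for aa in seq)


def build_seed_index(sequences, k=3):
    """Map each reduced k-mer to the (sequence id, position) pairs where it occurs."""
    index = defaultdict(list)
    for seq_id, seq in enumerate(sequences):
        reduced = reduce_sequence(seq)
        for pos in range(len(reduced) - k + 1):
            index[reduced[pos:pos + k]].append((seq_id, pos))
    return index
```

With 7 groups instead of 20 residues, the number of possible 3-mers drops from 8000 to 343, which is the kind of key-space reduction that makes a smaller index possible. The cost is that distinct seeds collide, so hits need to be rescored against the original sequences.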

    An iterative block-shifting approach to retention time alignment that preserves the shape and area of gas chromatography-mass spectrometry peaks

    Background: Metabolomics, petroleum and biodiesel chemistry, biomarker discovery, and other fields which rely on high-resolution profiling of complex chemical mixtures generate datasets which contain millions of detector intensity readings, each uniquely addressed along dimensions of time (e.g., retention time of chemicals on a chromatographic column), a spectral value (e.g., mass-to-charge ratio of ions derived from chemicals), and the analytical run number. They also must rely on data preprocessing techniques. In particular, inter-run variance in the retention time of chemical species poses a significant hurdle that must be cleared before feature extraction, data reduction, and knowledge discovery can ensue. Alignment methods for calibrating retention reportedly (and in our experience) can misalign matching chemicals, falsely align distinct ones, be unduly sensitive to chosen values of input parameters, and result in distortions of peak shape and area. Results: We present an iterative block-shifting approach for retention-time calibration that detects chromatographic features and qualifies them by retention time, spectrum, and the effect of their inclusion on the quality of alignment itself. Mass chromatograms are aligned pairwise to one selected as a reference. In tests using a 45-run GC-MS experiment, block-shifting reduced the absolute deviation of retention by greater than 30-fold. It compared favourably to COW and XCMS with respect to alignment, and was markedly superior in preservation of peak area. Conclusion: Iterative block-shifting is an attractive method to align GC-MS mass chromatograms that is also generalizable to other two-dimensional techniques such as HPLC-MS.
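Why shifting preserves peak shape and area, unlike warping, can be seen in a much simpler setting than the one the abstract describes. The sketch below is an assumption-laden simplification: it applies one rigid integer shift to a whole trace, chosen by maximizing the dot product with a reference, and does not implement the paper's iterative, feature-qualified block-shifting. The function names are hypothetical.

```python
def best_shift(reference, signal, max_shift):
    """Return the integer shift of `signal` that best matches `reference`.

    A positive shift moves the signal toward later retention times.
    """
    def score(shift):
        total = 0.0
        for i, ref_value in enumerate(reference):
            j = i - shift  # index in the unshifted signal
            if 0 <= j < len(signal):
                total += ref_value * signal[j]
        return total

    return max(range(-max_shift, max_shift + 1), key=score)


def apply_shift(signal, shift):
    """Shift a trace by whole samples, padding with zeros.

    Intensities are moved, never interpolated, so peak shape and area are
    unchanged as long as no peak is pushed off either end of the trace.
    """
    n = len(signal)
    shifted = [0.0] * n
    for i, value in enumerate(signal):
        j = i + shift
        if 0 <= j < n:
            shifted[j] = value
    return shifted
```

A run would be aligned with `apply_shift(run, best_shift(reference, run, max_shift))`. Splitting the trace into blocks and shifting each one separately would be the natural next step toward the approach named in the abstract.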

    Optimizing substitution matrix choice and gap parameters for sequence alignment

    Background: While substitution matrices can readily be computed from reference alignments, it is challenging to compute optimal or approximately optimal gap penalties. It is also not well understood which substitution matrices are the most effective when alignment accuracy is the goal rather than homolog recognition. Here a new parameter optimization procedure, POP, is described and applied to the problems of optimizing gap penalties and selecting substitution matrices for pair-wise global protein alignments. Results: POP is compared to a recent method due to Kim and Kececioglu and found to achieve from 0.2% to 1.3% higher accuracies on pair-wise benchmarks extracted from BALIBASE. The VTML matrix series is shown to be the most accurate on several global pair-wise alignment benchmarks, with VTML200 giving best or close to the best performance in all tests. BLOSUM matrices are found to be slightly inferior, even with the marginal improvements in the bug-fixed RBLOSUM series. The PAM series is significantly worse, giving accuracies typically 2% less than VTML. Integer rounding is found to cause slight degradations in accuracy. No evidence is found that selecting a matrix based on sequence divergence improves accuracy, suggesting that the use of this heuristic in CLUSTALW may be ineffective. Using VTML200 is found to improve the accuracy of CLUSTALW by 8% on BALIBASE and 5% on PREFAB. Conclusion: The hypothesis that more accurate alignments of distantly related sequences may be achieved using low-identity matrices is shown to be false for commonly used matrix types. Source code and test data are freely available from the author's web site at http://www.drive5.com/pop.

    Structural Elucidation and Functional Characterization of the Hyaloperonospora arabidopsidis Effector Protein ATR13

    The oomycete Hyaloperonospora arabidopsidis (Hpa) is the causal agent of downy mildew on the model plant Arabidopsis thaliana and has been adapted as a model system to investigate pathogen virulence strategies and plant disease resistance mechanisms. Recognition of Hpa infection occurs when plant resistance proteins (R-genes) detect the presence or activity of pathogen-derived protein effectors delivered to the plant host. This study examines the Hpa effector ATR13 Emco5 and its recognition by RPP13-Nd, the cognate R-gene that triggers programmed cell death (HR) in the presence of recognized ATR13 variants. Herein, we use NMR to solve the backbone structure of ATR13 Emco5, revealing both a helical domain and a disordered internal loop. Additionally, we use site-directed and random mutagenesis to identify several amino acid residues involved in the recognition response conferred by RPP13-Nd. Using our structure as a scaffold, we map these residues to one of two surface-exposed patches of residues under diversifying selection. Exploring possible roles of the disordered region within the ATR13 structure, we perform domain swapping experiments and identify a peptide sequence involved in nucleolar localization. We conclude that ATR13 is a highly dynamic protein with no clear structural homologues that contains two surface-exposed patches of polymorphism, only one of which is involved in RPP13-Nd recognition specificity

    Plato's Cave Algorithm: Inferring Functional Signaling Networks from Early Gene Expression Shadows

    Improving the ability to reverse engineer biochemical networks is a major goal of systems biology. Lesions in signaling networks lead to alterations in gene expression, which in principle should allow network reconstruction. However, the information about the activity levels of signaling proteins conveyed in overall gene expression is limited by the complexity of gene expression dynamics and of regulatory network topology. Two observations provide the basis for overcoming this limitation: (a) genes induced without de-novo protein synthesis (early genes) show a linear accumulation of product in the first hour after the change in the cell's state; (b) the signaling components in the network largely function in the linear range of their stimulus-response curves. Therefore, unlike most genes or most time points, expression profiles of early genes at an early time point provide direct biochemical assays that represent the activity levels of upstream signaling components. Such expression data provide the basis for an efficient algorithm (Plato's Cave algorithm; PLACA) to reverse engineer functional signaling networks. Unlike conventional reverse engineering algorithms that use steady state values, PLACA uses stimulated early gene expression measurements associated with systematic perturbations of signaling components, without measuring the signaling components themselves. Besides the reverse engineered network, PLACA also identifies the genes detecting the functional interaction, thereby facilitating validation of the predicted functional network. Using simulated datasets, the algorithm is shown to be robust to experimental noise. Using experimental data obtained from gonadotropes, PLACA reverse engineered the interaction network of six perturbed signaling components. The network recapitulated many known interactions and identified novel functional interactions that were validated by further experiment. PLACA uses the results of experiments that are feasible for any signaling network to predict the functional topology of the network and to identify novel relationships.

    A Network Analysis of the Human T-Cell Activation Gene Network Identifies Jagged1 as a Therapeutic Target for Autoimmune Diseases

    Understanding complex diseases will benefit from recognition of the properties of the gene networks that control biological functions. Here, we set out to model the gene network that controls T-cell activation in humans, which is critical for the development of autoimmune diseases such as Multiple Sclerosis (MS). The network was established on the basis of the quantitative expression from 104 individuals of 20 genes of the immune system, as well as on biological information from the Ingenuity database and Bayesian inference. Of the 31 links (gene interactions) identified in the network, 18 were identified in the Ingenuity database and 13 were new, and we validated 7 of 8 interactions experimentally. In the MS patients network, we found an increase in the weight of gene interactions related to Th1 function and a decrease in those related to Treg and Th2 function. Indeed, we found that IFN-β therapy induces changes in gene interactions related to T cell proliferation and adhesion, although these gene interactions were not restored to levels similar to controls. Finally, we identify JAG1 as a new therapeutic target whose differential behaviour in the MS network was not modified by immunomodulatory therapy. In vitro treatment with a Jagged1 agonist peptide modulated the T-cell activation network in PBMCs from patients with MS. Moreover, treatment of mice with experimental autoimmune encephalomyelitis with the Jagged1 agonist ameliorated the disease course, and modulated Th2, Th1 and Treg function. This study illustrates how network analysis can predict therapeutic targets for immune intervention and identifies the immunomodulatory properties of Jagged1, making it a new therapeutic target for MS and other autoimmune diseases.

    Optimization in computational systems biology

    Optimization aims to make a system or design as effective or functional as possible. Mathematical optimization methods are widely used in engineering, economics and science. This commentary is focused on applications of mathematical optimization in computational systems biology. Examples are given where optimization methods are used for topics ranging from model building and optimal experimental design to metabolic engineering and synthetic biology. Finally, several perspectives for future research are outlined

    Defects in Mitochondrial Dynamics and Metabolomic Signatures of Evolving Energetic Stress in Mouse Models of Familial Alzheimer's Disease

    The identification of early mechanisms underlying Alzheimer's Disease (AD) and associated biomarkers could advance development of new therapies and improve monitoring and predicting of AD progression. Mitochondrial dysfunction has been suggested to underlie AD pathophysiology; however, no comprehensive study exists that evaluates the effect of different familial AD (FAD) mutations on mitochondrial function, dynamics, and brain energetics. We characterized early mitochondrial dysfunction and metabolomic signatures of energetic stress in three commonly used transgenic mouse models of FAD. Assessment of mitochondrial motility, distribution, dynamics, morphology, and metabolomic profiling revealed the specific effect of each FAD mutation on the development of mitochondrial stress and dysfunction. Inhibition of mitochondrial trafficking was characteristic for embryonic neurons from mice expressing mutant human presenilin 1, PS1(M146L), and the double mutation of human amyloid precursor protein APP(Tg2576) and PS1(M146L), contributing to the increased susceptibility of neurons to excitotoxic cell death. Significant changes in mitochondrial morphology were detected in APP and APP/PS1 mice. All three FAD models demonstrated a loss of the integrity of synaptic mitochondria and energy production. Metabolomic profiling revealed mutation-specific changes in the levels of metabolites reflecting altered energy metabolism and mitochondrial dysfunction in brains of FAD mice. Metabolic biomarkers adequately reflected gender differences similar to those reported for AD patients and correlated well with the biomarkers currently used for diagnosis in humans. Mutation-specific alterations in mitochondrial dynamics, morphology and function in FAD mice occurred prior to the onset of memory and neurological phenotype and before the formation of amyloid deposits. Metabolomic signatures of mitochondrial stress and altered energy metabolism indicated alterations in nucleotide, Krebs cycle, energy transfer, carbohydrate, neurotransmitter, and amino acid metabolic pathways. Mitochondrial dysfunction, therefore, is an underlying event in AD progression, and FAD mouse models provide valuable tools to study early molecular mechanisms implicated in AD.